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Abstract 



Estimating the spectra of non-stationary signals represents a difficult challenge. 
Classical techniques employing the Fourier transform and local stationarity have been 
employed with limited success. A more promising approach is the use of time-frequency 
distributions. The majority of useful distributions have been unified under Cohen's class 
of distributions, a bilinear transformation with an arbitrary, fixed kernel function. The 
properties of several popular distributions developed from Cohen's class of distribution 
are examined. The ability of the kernel to suppress spurious cross-terms resulting from 
the bilinear nature of these distributions is examined along with their characteristics. 
Distributions employing a fixed kernel usually give good results only for a small class of 
signals. A data adaptive kernel is also examined which promises to give superior results 
for a broad class of signals. Results are shown for several test cases employing synthetic, 
analytic signals. 
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I. INTRODUCTION 



A. PROBLEM STATEMENT 

Currently, analysis of non-stationary spectra is an important tool in the field of signal 
processing. The goal of such an analysis is to obtain a time history of a signal's 
frequency content and track statistical changes. Such insight into a signal’s temporal 
behavior is extremely useful in signal detection, identification, and synthesis. These 
applications are of primary importance to the fields of speech processing, acoustic 
processing, and seismic analysis among others. 

Classical methods for analyzing spectra are derived from the Fourier transform. 
Fourier analysis through the Fast Fourier Transform (FFT) is the tool of choice for 
stationary analysis due to its ease of implementation. However, the assumption of 
stationarity precludes any notion of time dependence. If the incoming signal contains 
amplitude or frequency components which vary with time, the changes are masked. 

Fourier analysis extends to time-frequency analysis through the assumption that local 
stationarity exists. This is the basis for the spectrogram. The data is segmented through 
a window to a length chosen to ensure stationarity. The Fourier transform of the 
windowed data allows the determination of the signal’s energy distribution as a function 
of frequency at a given time. Sliding the window along the data results in the generation 
of a time-frequency surface. The primary drawback to the spectrogram is that frequency 
resolution varies directly with the window length. Temporal resolution is obtained at the 
expense of frequency resolution. The spectrogram is inadequate for rapidly varying 
spectra. 

An alternative method of analyzing time- varying spectra lies in the use of a joint 
time-frequency distribution. A time-frequency distribution represents a function 
describing the energy density of a signal simultaneously in the time and frequency 
domains. At first glance, the distribution appears to represent a statistical artifice. 
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Ideally, a joint distribution has the same properties as a joint density function and may be 
manipulated in a like manner. For example, the distribution of energy with frequency 
can be obtained by integrating over time. However, time-frequency distributions do not 
have to be strictly valid in a statistical sense to be of use. Several useful distributions to 
be encountered later illustrate this feature. As long as the distribution obeys certain 
desirable properties that allow consistent interpretation of the power spectrum it serves a 
valid purpose. 

Many time-frequency distributions have been proposed, the majority of which can be 
unified under a class of distributions proposed by Cohen in 1966. The Cohen class of 
distributions is a bilinear transformation characterized by the use of an arbitrary kernel 
function. Depending on the choice of kernel function, various distributions have been 
proposed, each with desirable and undesirable properties. Well known distributions of 
this class include Wigner-Ville, Choi-Williams, Bom-Jordan, and the ZAM 
transformation. The spectrogram can also be considered to be a member of Cohen's 
class. 

Although Cohen's class of distributions has many important characteristics, the 
bilinear structure of the class can produce a combination of auto-terms and undesirable 
cross-terms when the signal is composed of multiple frequency components. The 
presence of cross-terms obscures spectral features necessary for signal recognition or 
classification. The degree of interference present depends on the kernel employed in a 
particular distribution. Choosing a kernel demands tradeoffs. While cross-terms need to 
be suppressed, the properties desired of a reasonable time-frequency distribution should 
be retained. Typically a compromise is reached by sacrificing select properties to obtain 
reasonable suppression. The resulting distribution has limitations but typically performs 
well for certain classes of signals. 
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B. OBJECTIVES 

The goal of this thesis is to examine the performance and limitations of selected 
kernels used with the Cohen class of distributions. The methods examined here are those 
of Wigner-Ville, Choi-Williams, and the ZAM kernel. Each method has its own 
personality and areas of application, yet each represents a particular set of compromises. 
The Wigner-Ville method excels when used with single-component linear FM signals but 
is unable to suppress the cross-terms that arise with spectrally complex signals. Both the 
Choi-Williams and ZAM kernels suppress cross-terms to some degree but do not offer 
good performance for a broad class of signals. However, individuals signals have their 
own particular characteristics which may be characterized in the ambiguity domain. A 
more effective time-frequency distribution might attempt to take advantage of the signal’s 
characteristics. One promising method employing a Gaussian signal-dependent kernel is 
examined here. By tailoring the kernel to the location of a signal's auto-terms on the 
ambiguity plane, good performance for a broad class of signals is possible. 

Performance of the methods above is illustrated graphically using several classes of 
synthetic analytic signals. 
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II. SPECTROGRAM 



For a band-limited, wide-sense stationary (WSS) process, the Wiener-Khinchin 
theorem [1] states that the Fourier transform of the autocorrelation function yields its 
power spectral density (PSD): 



rjf)= ) 



(i) 



With a finite data set, the PSD is calculated directly from the data. The periodogram 
estimates the PSD as the magnitude squared of the Fourier transform of the data: 



£,(/) = 7 

1 0 

In discrete form, (2) becomes 



( 2 ) 






AM 



YjX(n)e 



-j2nfn 



n = 0 



(3) 



where the discrete Fourier transform is typically calculated by an FFT algorithm. 

The periodogram can approximate a time-frequency surface by separating the data 
into contiguous blocks and processing each separately. Each block produces a spectral 
estimate. Laying the spectral lines beside one another yields an estimate of the time- 
frequency surface. Frequency resolution is directly affected by the length of the blocks 
used. Long blocks give better resolution but tend to smoothen nonstationarities. Shorter 
blocks track nonstationarities better but at the expense of frequency resolution. 

A time-frequency surface generated with the periodogram provides only a weak 
association between time and spectral behavior. The spectrogram [2] allows a direct 
association between time and spectral estimates and is a true time-frequency 
representation. The spectrogram segments data through a sliding window, centered about 
time t: 



4 



2 









( 4 ) 



Window duration is chosen to ensure local stationarity exists and provides a means of 
controlling the frequency resolution. The spectral estimate is both real-valued and 
positive. For discrete time, (4) becomes 



£,(*./) 



JV-l l 

Y J x{n)w{n-k)e- j7 " fn . 

rt - 0 



(5) 



A measure of the spectrogram's accuracy as a time-frequency representation may be 
found by determining expressions for the instantaneous energy and energy density 
spectrum. These are found by integrating over frequency and time respectively. 
Integrating over frequency gives 



] P xx (t,f)df = ]\x(r)\ 2 w 2 (r-t)dr 
while integrating over time yields 



( 6 ) 



j 1 0, f)dt = f \x(of\w(o - f) I 2 do. (7) 

These expressions show that the sliding window causes smearing along both the time and 
frequency directions. 

Another check of the spectrogram is to calculate the total signal energy, represented 
by the volume under the spectrogram. Integrating over both time and frequency results 
in 



j J P xx (t,f)dtdf = "j'j\x(T)\ 2 w 2 (r-t)dTdt (8) 

which shows that the spectrogram does not accurately represent the signal energy 
whenever the average energy of the window is not equal to unity. 
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III. GENERALIZED TIME-FREQUENCY 
REPRESENTATIONS 
(GTFR) 



A. TIME-FREQUENCY DISTRIBUTIONS 

As noted previously, the spectrogram suffers from two principal drawbacks. First, the 
time and frequency resolutions of the method are inversely related. Second, the 
assumption of local stationarity may not always be valid. In that case, the frequency 
distribution of the signal no longer represents the true PSD. 

One approach to handle non-stationary signals is to introduce a time-dependent 
correlation function into the Wiener-Khinchin theorem. In general the autocorrelation 
function may be defined as 

1 

R xx (t l ,t l + T) = jjx(t)x\t + x)dt (9) 

where r, represents an arbitrary reference time and the superscript * denotes complex 
conjugate. A symmetrically specified correlation function may be developed by defining 

t.=t~— and t 7 =t + — (10) 

1 2 2 2 

which in turn gives 

X = t 2 -t l and t = ~~~ i L • (11) 

Using these definitions in (9) yields a time indexed autocorrelation function: 






= E X (t+l)x\t-h 
2 2 



( 12 ) 



Using an instantaneous autocorrelation value in the Wiener-Khinchin theorem results in 
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(13) 



PM)=]x(t + ^)x(t-^)e-' 2 ^dx 

which is the well known Wigner-Ville distribution [3]. This distribution will be 
examined later. 

A different approach in handling nonstationary signals is to express signal energy as a 
joint function of time and frequency. The benefit of formulating a time-frequency 
distribution, T(t,f), is that it can be endowed with properties desirable for the purpose of 
signal processing. One desirable property for a time-frequency distribution is that it 
represents a true energy distribution. An energy distribution requires three relationships 
to hold. First, the time marginal of the energy distribution represents the energy density 
spectrum: 

[T(t,f)dt = \X(f)\ 2 . (14) 

Second, the frequency marginal gives the instantaneous energy: 

jT(t,f)df = \x(t)\ 2 . (15) 

Finally, integrating over time and frequency gives the total energy of the signal: 

\ / \T(t,f)dtdf = \j i x(t)\ 2 dt = E x (16) 

As noted previously, the spectrogram represents a smeared version of the energy 
distribution. The degree of smearing depends on the window employed and on the non- 
stationarity of the signal. 

Other properties desired for a time-frequency distribution fall into two categories. 

The first are key properties without which physical interpretation of the time-frequency 
plane would be impossible. An example is shift invariance. The remaining properties, 
such as time support, are less critical and are not absolutely required for a distribution to 
be useful. In fact, not all of the properties can even be supported simultaneously. A 
listing of desired properties is as follows [4]: 



n 



• Time Shift : If the signal is shifted in time, the time-frequency distribution is also 
shifted in time by the same amount. 

*(*)-> T(t,f) 

x(t-t 0 )->T(t-t 0 ,f) 
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Frequency Shift : If the signal is shifted in frequency, the time-frequency distribution 
is also shifted in frequency by the same amount. 



• Real-valued : The time-frequency distribution is be real everywhere. 

7Xr,/)e9S 

• Non-negative : The time-frequency distribution is positive everywhere. 

T(t,f)>0 V(r,/) 



(18) 



(19) 

( 20 ) 



• Time support : If the signal is zero at some point in time, the time-frequency 
distribution is also zero the same time. 

*(/ 0 ) = 0-»7Xr 0 ,/) = 0 (21) 



• Frequency Supp ort: If the signal has a spectral energy density of zero at some 
frequency, the time-frequency distribution is zero at the same frequency. 

X(f 0 ) = 0-*T(t,f 0 ) = 0 (22) 

• Instantaneous Frequency : The instantaneous frequency of a signal at a given point in 
time is equal to the normalized first-order moment in frequency of the time- 
frequency distribution. 



m= 



jT(tJ)df 



(23) 
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TABLE 1: DESIRABLE PROPERTIES FOR TIME-FREQUENCY DISTRIBUTIONS 



Property 

Number 


Property 


Equation 

Number 


PI 


Time Marginal 


[nt,f)dt=\x(f)\ 2 


14 


P2 


Frequency Marginal 


jT(t,f)df = \x(t )\ 2 


15 


P3 


Total Signal Energy 


l f \T(tJ)dtdf = [\x(tfdt = E x 


16 


P4 


Time Shift 


x(t) — » T(t,f) 
x(t-t 0 ) — » T(t-t 0 ,f ) 


17 


P5 


Frequency Shift 




18 


P6 


Real-valued 


T{t,f) € 91 


19 


P7 


Non-negative 


> 

o 

Al 

N 


20 


P8 


Time support 


x{t 0 ) = 0->T(t 0 ,f) = 0 


21 


P9 


Frequency Support 


X(f a ) = 0 — * T(t,f 0 ) = 0 


22 


P10 


Instantaneous 

Frequency 


(mi ,m 


23 


Pll 


Group Delay 


\tT{tJ)dt 

'■ (/)= J ru.m 


24 



• Group Delay : The group delay of a signal at a given point in frequency is equal to 
the normalized first-order moment in time of the time-frequency distribution. 



',(/) = 



\ntj)dt 



(24) 



All of the properties discussed are summarized in Table 1 and denoted Pl-Pl 1 for 
later reference. 



B. COHEN S CLASS OF DISTRIBUTIONS 

A wide variety of time-frequency distributions have been proposed, each with their 
own unique set of properties. Although the distributions proposed were all based on 
valid principles, their relationship to each other was unclear. In 1966, Cohen proposed a 
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generalized phase-space distribution function from which most popular distributions can 
be derived [5]. The class of distributions is given by 

C(t,f)= J ]]mr)x(u + l)x\u-l)e i2 * ( *-«-* ) dudTde (25) 

where <f>(0,T) represents an arbitrary kernel function represented in the ambiguity domain 
(Doppler-shift, time lag). Depending on the choice of kernel function in (25), literally an 
infinite number of different distributions may be generated. 

The relationship given by (25) allows identification of those properties common to all 
member distributions. For example, the bilinear structure gives rise to spurious cross- 
components with multi-component signals. However, the individual properties of a 
particular distribution are determined by its kernel function. More importantly, by 
placing constraints on the kernel function a desired set of characteristics is obtained in 
the resulting distribution. Referring back to the properties Pl-Pl 1 in Table 1, each 
imposes a different constraint on the kernel function. Table 2 shows the constraints 
necessary to achieve each property [4][6]. Since some of the constraints are mutually 
exclusive, choosing a kernel function ultimately involves trade-offs. Cohen [6] gives 
derivations of selected constraints. 

In addition to (25), Cohen's generalized distribution can be expressed in four 
additional ways [4]. First, the generalized distribution represents the two-dimensional 
convolution of the Wigner-Ville distribution with the kernel function represented in the 
time-frequency domain: 

= (26) 
Second, the generalized distribution can be written as the double Fourier transform of the 
product between the kernel expressed in the ambiguity domain and the ambiguity 
function: 
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TABLE 2: KERNEL CONSTRAINTS TO OBTAIN DESIRABLE PROPERTIES 



Property 


Kernel Constraint 


Constraint Number 


Time Marginal 


00 , 0 ) = i ve 


Q1 


Frequency Marginal 


0(O,x) = 1 Vx 


Q2 


Total Signal Energy 


0(0,0) = 1 


Q3 


Time Shift 


0(0, t) independent of t 


Q4 


Frequency Shift 


0(0, x) independent of f 


Q5 


Real-valued 


0(0, x) = 0* (— 0,— x) 


Q 6 


Non-negative 


F e ,[<Ke,t)]>o 


Q7 


Time support 


vy(r,x) = 1 0(0,x)e“‘' 2rce '^0 = 0 
for |x| < 2|/| 


Q 8 


Frequency Support 


J(K0,TV iI,ft rfT = O 
for |6|c2|/| 


Q9 


Instantaneous Frequency 


<)>(0,O) = 1 V0 
and 

_ n 

3x U 


Q10 


Group Delay 


0(O,x) = 1 Vx 
and 

30 9=0 


QH 



C(f,/)= J ^{Q,x)A{Q,x)e' i2m+zf) dxdQ 



(27) 



where 



A(6, T)= J x(u + ^)x‘ (u - ^)e ,2mj0 du. (28) 

The product of the kernel and the ambiguity function is known as the generalized 
ambiguity function and represents the characteristic function for (25). In the temporal 
domain, the kernel function becomes 



\\i(t,x)= j<) )(Q,x)e~ i2 ^'dQ. 



(29) 



Then, another expression for the generalized distribution is the one-dimensional Fourier 
transform of the convolution between the temporal domain kernel and the instantaneous 
autocorrelation: 



C{Uf) = \[R„{ux)*\!f{ut)Y Ml dx 



(30) 



where 






(31) 



Finally, in the spectral domain, the kernel function becomes 



v(/,e)= J<K-e,T>-' 2 ^T. 



(32) 



With this kernel, the last expression for the generalized distribution is the one- 
dimensional inverse Fourier transform of the convolution between the spectral domain 
kernel and the instantaneous spectral autocorrelation: 



Figure 1 illustrates the relationships between each of the representations shown above. 
While Cohen's generalized distribution can be expressed in equivalently in the four 
domains, some domains may more readily lend themselves to implementation. 

C. THE AMBIGUITY FUNCTION 

The ambiguity function (28) represents a frequency-indexed autocorrelation function 
[7], When multiplied by the kernel function, the generalized ambiguity function results: 




(33) 



where 



/?^(/,0) = X(/ + |)X*(/-|). 



(34) 



A'(e,T)=<t>(e,T)A(e,T). 



(35) 
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Rxx(fB)*'¥(fQ) R^(t,x)*y(t,x) 




<))(0,xM(e,T) 




Legend 

fl- Fourier transform w.r.t. 
first variable 

iF2: Fourier transform w.r.t. 
second variable 



Figure 1 . Relationships between Cohen's generalized distribution in various domains 



As a distribution, the generalized ambiguity function is not useful. However, as the 
characteristic equation of Cohen's generalized distribution it is an important tool in 
determining the properties of the distribution. 

Referring back to (28), the ambiguity function is bilinear with respect to the signal. 

As such, it exhibits undesirable cross-terms with multicomponent signals. Consider the 
following signal expressed as the sum of its individual components: 

*(0 = X**(0- (36) 

*=i 

Substituting (36) into the ambiguity function results in auto-terms and cross-terms 
between the components: 

>l(e.T) = X A xixl (e. I) + _ (0, T) (37) 

1=1 l*m 

v v ' v * ' 

auto- terms cross- terms 

where 

4,*,(0>t)= ]x,(u + ^)xi(u-^)e j2mje du (38) 

and 
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(6,t)= jx,(u + |)x* (u - l)e i2m9 du. 



(39) 



Ignoring the kernel function for now, when the ambiguity function is transformed to the 
time-frequency domain using (27), the cross-terms persist as spurious artifacts in the 
time-frequency distribution of the signal. Signal interpretation and identification become 
more difficult. 

An approach to remove the degradation in the time-frequency domain by cross-terms 
is to eliminate them in the ambiguity domain. Flandrin [8] has noted that the auto-terms 
(38) are located primarily about the origin of the ambiguity plane. Cross-terms (39) tend 
to be positioned away from the origin at a distance related to the distance between the 
signal components involved on the time-frequency plane. Recalling (35), the above 
suggests that the kernel function should apply a large weight near the origin to promote 
auto-terms and a small weight away from the origin to suppress cross-terms. Therefore, 
to reduce interference in the time-frequency domain, the kernel function should be a two- 
dimensional lowpass filter in the ambiguity domain. 

Two simple examples illustrate the structure of signals in the ambiguity domain. First 
consider a signal composed of two complex sinusoids: 



where the first term represents the auto-term and the last two, cross-terms. As (41) 
indicates, only the auto-term passes through the origin and lies directly on x axis. The 
cross-terms parallel the x axis but never come close to the origin. Figure 2 shows a plot 
of the ambiguity function for 



j c(t) = V W + V W - 



(40) 



Substituting (40) into the ambiguity function results in 



A(6,x) = (A,V 2 ^ t + AjV 2 ’* 1 )5(0) + A, + (/, - f 2 )) 
+A l A 2 e i2 « f '+ f *W + (f 2 -j i)) 



(41) 




( 42 ) 
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Figure 2. Ambiguity function of Equation (42) 
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which demonstrates the behavior expected by (41). 

Consider next a multicomponent signal composed of linear chirps. Figure 3 shows 




Figure 3. Ambiguity function of Equation (43) 
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the ambiguity function for 

x{n ) = 4e A ^ W6 ' + 4 e Hfa) (43) 

where the time period was chosen to ensure the chirps do not cross in the time-frequency 
plane. The auto-terms are seen to cross diagonally through the origin while the cross- 
terms lie far away. However, if the components of the signal touch or cross on the time- 
frequency plane, cross-terms of the ambiguity function will appear at the origin. Figure 
4 shows the ambiguity function for 



x(n) = 4e ^ sh56) + 4e ^ 8 hse) (44 

and clearly shows cross-terms passing through the origin. 

Reflecting upon Figures 2-4, the shape of the kernel’s passband must be chosen with 
care to do the best job at removing cross-terms while preserving auto-terms. Consider, 
for example, the situation portrayed in Figure 5. Figure 5(a) shows the ambiguity 
function of a signal composed of two linear chirps. The signal is to be Filtered in the 
ambiguity domain to remove cross-terms using a kernel possessing an elliptic passband. 




Figure 4. Ambiguity function of equation (44) 
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In Figure 5(b), the major axis of the kernel is aligned along the T-axis, removing all of 
the cross-terms. The resulting time-frequency distribution will be free of cross-terms and 
show good auto-term resolution. But, if the major axis of the kernel is changed to the 0- 
axis as in Figure 5(c), the cross-terms are not fully filtered and the auto-terms are 
partially suppressed. The resulting time-frequency distribution will show cross-terms as 
well as smoothed auto-terms. For the best time-frequency representation from Cohen's 
generalized distribution, the kernel function must take into account the signal's structure 
on the ambiguity plane. Of course, if cross-terms occur at the origin, as in Figure 4, 
removing them is very difficult and the signal's time-frequency distribution will not be 
totally satisfactory. 
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(a) 





Figure 5. 



(a) Ambiguity function of two linear chirps 

(b) Filtering of the ambiguity function with an elliptic kernel (shaded 
region) oriented along the x-axis 

(c) Filtering of the ambiguity function with an elliptic kernel oriented 
along the 0-axis 
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IV. FIXED KERNEL DISTRIBUTIONS 



A. WIGNER-VILLE DISTRIBUTION 

The Wigner distribution [9] was originally proposed in the field of quantum 
thermodynamics in 1932 as a correction to the behavior of atoms at low temperatures. 
Ville [10] reintroduced the distribution in 1948 for use in signal processing and 
demonstrated its use with analytic functions. More recently, the Wigner-Ville 
Distribution (WD) has been studied extensively by Boashash [11] and Claasen and 
Mecklenbraucker [ 1 2][ 1 3]. 

The WD is derived from Cohen’s generalized distribution using the kernel function 

<K9,i) = l. (45) 

Substituting (45) into (25) gives 



WD{t,f)= ] \ ]x(u + ^)x\u~)e i2 ^ u -*'- xf) dxdud$ 

= ]]*<“+ § )x' (u - —)e~ i2,nf d(u - t)dxdu (46) 

= ]x(t + ^)x‘(t-^)e~ i2,nf dx. 



As shown earlier, the WD may be interpreted as the Fourier transform of an 
instantaneous, symmetrical correlation estimate through the Wiener-Khinchin theorem. 
The WD also enjoys a simple relationship with the ambiguity function (28). Since the 
kernel function is equal to unity, (27) indicates the WD and the ambiguity function are 
related by a two-dimensional Fourier transform: 

jri 

WD(t,f) *' A( e,T). (47) 

<— F 
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Due to its simple structure, the WD is a convenient means for calculating the ambiguity 
function. 

In contrast to the periodogram, the WD possesses high temporal and frequency 
resolution simultaneously. Referring back to Table 2, the WD's kernel function ensures 
that the distribution obeys all of the properties listed with two exceptions. A non- 
negative distribution is not assured, a factor which limits the WD's usefulness as an 
energy distribution. Also, finite time support is violated in some cases. For example, if a 
signal contains short duration null intervals, the WD will not be zero during these 
intervals. Figure 6 shows the WD of an on-off keyed complex sinusoid and indicates the 
region over which the signal is actually zero. 

The main drawback to the WD are the cross-terms produced between frequency 
components due to its bilinear structure. Since the kernel function represents an all-pass 
filter in the ambiguity domain, the cross-terms are not attenuated in the time-frequency 
plane. Besides complicating the time-frequency distribution, most of its negative values 




Figure 6. Example of WD indicating signal energy where the 
signal is actually zero. 
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arise from the cross-terms. For a signal containing N frequency components, there are 



(AT| 



AM 



(Af-2)!2! 

cross-terms. As an example, recall (40), a signal composed of two complex sinusoids. 
Substituting (40) into (46) yields 



(48) 



= A 2 8(/ -f x ) + A 2 2 8(/ - f 2 ) 

f+f (49) 

+2 A, A 2 8(/ - ^yk)cos((/ 2 - /, )t). 

As (49) shows, and is generally true, the cross-term appears at the arithmetic mean of the 
frequencies of the two components involved. For this example, the magnitude of the 
cross-term is twice as great as the auto-terms if the auto-terms are of the same magnitude. 
Figure 7 shows the WD of the complex signal (42). 

The WD may be expressed in discrete form [13] as 

WD(n,f ) = 2 ]£ jt(/t + k)x\n - k)e- j4nif . (50) 

*=-« 

Unlike the spectra of discrete time signals, (50) is periodic in/with period n instead of 




Figure 7. WD of two complex sinusoids, Equation (42) 
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2k. Since the spectra of real valued signals is non-zero in the interval [0,27t], sampling at 
the Nyquist rate leads to aliasing of the spectrum. To avoid this, the signal must be 
sampled at twice the Nyquist rate, 

/.* 4 /-. ( 51 ) 

or the discrete time signal must be interpolated by a factor of two. 

An alternative that allows sampling at the Nyquist rate is to use only analytic signals 
which have a non-zero spectrum only in the interval [0,7t]. Every real valued signal x/n) 
has an associated complex valued analytic signal x(n) such that 

*,(n) = Re[x(/i)]. (52) 

The analytic signal is obtained from the real valued signal through the relationship [14] 
x(n) = x r (n) + jH[x r (n)] (53) 

where H[-] represents the Hilbert transform. A faster method is to take the Discrete 
Fourier Transform (DFT) of x r (n), zero out the negative frequencies, multiply the 
positive frequencies by two, and take the inverse DFT. An additional benefit of using 
analytic signals is reduction of cross-terms. Since only positive frequencies are present, 
the cross-terms arising from interaction between positive and negative frequency 
components are avoided. 

In actual practice, the data used in the discrete WD is typically windowed to 
smoothen the spectral estimate. The resulting distribution is known as the pseudo 
Wigner distribution (PWD): 

PWD(n,f ) = 2 jr jt(« + k)x’(n - k)w(k)w(-k)e' i4n¥ . (54) 

*=— 

Two-dimensional smoothing functions have also been developed to suppress cross-terms 
and obtain positive distributions [6]. 
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B. EXPONENTIAL DISTRIBUTION 



The exponential distribution (ED) was proposed by Choi and Williams [15], based on 
the kernel function 

eV 

<t> £C (6,x) = e « (55) 

where a is a positive scaling factor. Referring to Figure 8, (55) decays with increasing 
0T and acts as a two-dimensional lowpass filter. Accordingly, the ED demonstrates 
suppressed cross-terms while retaining most of the advantages of the WD. However, the 
ED does not obey the time and frequency support properties, and does not guarantee a 
positive distribution. 

Substituting the kernel function (55) into Cohen’s generalized distribution (25), an 
expression for the ED is obtained: 




Figure 8. Contour plot of the ED's kernel function in the 
ambiguity domain. 
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ED(t,f) = J J je ° x(u + ^)x\u-^)e l2K{6u ~ 6 ' xf> dxdudQ 




4 x/c , 



•x(u + —)x\u - —)du 
2 2 




( 56 ) 



The ED can be interpreted as the Fourier transform of a time-indexed autocorrelation 
estimate 

ED(t,f) = ]K(t,x)e- j2K *dx (57) 

where 

K(t,z) = J , 1 expf-^— p-l x(u + ^)x\u-hdu. (58) 

.•LV^nxVo t 4 x/c ) 2 2 

The autocorrelation estimate (58) represents a time average. To preserve the signal's 

time-varying properties, the exponential term applies a larger weight when u is close to t 

and a smaller weight when they are farther apart. To preserve accuracy, the range of the 

time average is controlled by x. For large values of t, a wider range is used and, 

conversely, a smaller range for small values of x. 

The kernel function performs Filtering in the ambiguity domain, emphasizing features 

lying close the origin and the axes. Width of the filter's passband is controlled by the 

scaling factor, a. For small values of a, the filter rolls off more sharply. Increasing o 

widens the passband and as o approaches infinity, the WD is obtained. The value of G 

also affects the autocorrelation estimate represented by (58). More averaging, and 

therefore more smoothing of the autocorrelation estimate, takes place as o is decreased. 

As o increases, (58) approaches an instantaneous autocorrelation estimate. Choosing a 
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value for a, then, imposes a tradeoff. Larger values of a give a sharper estimate of the 
autocorrelation function at the expense of weaker cross-term suppression. Smaller values 
tend toward the opposite. Choi and Williams have recommended that c be in the range 



To demonstrate the ability of the ED to suppress cross-terms, consider again (40), a 
signal composed of two complex sinusoids. Substituting (40) into (56) yields 



r)(/,^,/ 2 ,a) represents an attenuation factor that reduces the amplitude of the cross-term 
by spreading it along the frequency axis. The amplitude of the cross-term at a given 
frequency / depends on two factors. First, the amplitude is inversely proportional to the 
difference /, - f 2 . Second, the amplitude decreases exponentially with distance away 
from the cross-term's center frequency. A smaller value of a leads to a greater spreading 
of the cross-term while increasing o causes (59) to approach the WD result, (49), since 



0.1 to 10. 



ED{t,f) = A 2 b(f - + A 2 2 5(f - f 2 ) 

+2A,A 2 cos((f 2 ~f)t)n(f,f,f 2 ,o) 



(59) 



where 





(60) 



lim ri(/,yj,/ 2 ,a) = 8(/- 




2 



(61) 



Figure 9 shows the ED of the complex signal (42). 
The ED is expressed in discrete time [15] as 
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1 




(62) 



where VV^lx) and W M (u ) represent finite windows which slide along the time axis. 

VV^x) is an arbitrary, symmetrical window whose length and shape determine the 
frequency resolution of the distribution. The inner window, W M (u), is rectangular and 
determines the range over which the autocorrelation function is estimated. Like the WD, 
the discrete ED is periodic in /with period 4 n and is typically used with analytic signals. 

C. THE CONE-SHAPED (ZAM) KERNEL 

Although the ED offers suppression of cross-terms in the time-frequency domain, 
finite time support is sacrificed. To accomplish both goals simultaneously along with 
improved frequency resolution, Zhao, Atlas, and Marks [16] have proposed the cone- 
shaped (ZAM) kernel. The resulting distribution, however, does not satisfy time or 




Figure 9. ED of two complex sinusoids. Equation (42), 
using a = 10 
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frequency marginal properties. 



The kernel function is derived by considering the constraints necessary to achieve the 
desired properties. Referring back to Table 2, to maintain finite-time support the kernel 
must satisfy 



Therefore, the region of support for the kernel function in the (t,x) plane is limited to the 
cone-shaped region indicated in Figure 10. An appropriate choice for the kernel <})(r,t) 
rests on three considerations. First, for the best temporal resolution, the kernel should 
not be a function of time, t. This requirement assures that temporal smoothing does not 
take place. Second, to smooth the autocorrelation estimate and reduce cross-terms, the 
kernel should be a low-pass filter in the t-dimension. Combining these requirements 
with (64) gives 




(63) 



Equivalently, this constraint can be expressed in the temporal domain (r,x) as 

<K*,t) = 0, |t| < 2|/|. 



(64) 





(65) 



otherwise 



x 




Figure 10. The support region 
for the kernel (J)(r,T) 
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where g(x) is a suitable filter and a represents the slope of the cone subject to 2 < a < °° . 
The final consideration for the kernel is that it take the form of a lateral inhibition 
function in frequency,/, in the frequency plane (0,/). Lateral inhibition functions have 

been shown [17] to occur naturally in human vision and auditory systems, and enhance 
perception and feature detection. In signal processing, a function of this type represents a 
weighting scheme (shown in Figure 1 1) that enhances spectral peaks when convolved 
with the spectrum. Meeting the above also assures that the kernel will be low-pass in the 
0-dimension which aids in suppressing cross-terms. All of the above conditions can be 
satisfied by using any of the popular windowing functions for g(x). 

With a suitable choice for g(x), the kernel (65) can be expressed in the ambiguity 
domain [18] as 

4>(0,t) = |x|sinc(0x)g(x). (66) 

Substituting (66) into Cohen's generalized distribution (25), an expression for the cone- 
shaped kernel distribution (CSD) is obtained: 




Positive Weighting 



Negative Weighting 



f 



Figure 1 1. Weighting requirements for lateral inhibition 
functions 
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CSD(t,f) = 1 1 1 |t|sinc (0t)#(t).x(m + (u - ^)e ,2n{eu ^'^ f> dxdudd 

~ /+<t/2) (67) 

= Jg(t) | x(u + -)x'(u-—)e~ j2nfl dudx. 



The CSD can also be interpreted as the Fourier transform of a windowed, local 
autocorrelation function 



CSD(t,f) = j g(x)K(t,x)e~ j2vfz dx 



( 68 ) 



where 



K(t,x)= J x(u + — )x*(u — )du. (69) 

/-(t/2) 2 2 

The interval used to estimate the local autocorrelation function reflects the cone-shaped 
region of support and allows the CSD to track signals with rapidly varying nonstationary 
behavior. 

The CSD is expressed in discrete time [16] as 

L |*| 

CSD{nJ) = lY J g{k)e~ 1 ^ 2^x(n- p + k)x'(n- p- k). (70) 

*=-/. p=- 1*| 

To obtain a real distribution, the length of the window function g(k) must be odd. Since 
this implementation would preclude using an efficient FFT algorithm, (70) may be 
rewritten exploiting the symmetry of the window. Summing over one side of the 
window, (70) becomes 



CSD(n,f) = 4 Re 



YjS(k)R(n,k)e~ j4nlf 

V*-0 J 



(71) 



where 



( 72 ) 



and 



*(*) = 



\0.5g(k) k = 0 



8(k) 



otherwise 



R(n,k) = ^x(n- p + k)x*(n- p-k). 

p =- W 

Now an even window length is possible and the FFT algorithm may be employed. 
Figure 12 shows the CSD of the complex signal (42) using a Gaussian window. 



(73) 



D. GENERALIZED EXPONENTIAL DISTRIBUTION 

Boudreaux-Bartels and Papandreou [19] have noted that, in the ambiguity domain, the 
exponential kernel represents a poor filter. The kernel does not have a flat passband and 
rolls off very slowly. As a result, cross-terms near the origin are barely attenuated while 
auto-terms are distorted due to the narrow passband. Both of these problems can be 
remedied by using a lowpass filter with a flat passband and a narrow transition region to 
the stopband. 




Figure 12. CSD of two complex sinusoids. Equation 
(42), using a Gaussian window 
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In place of the ED, Boudreaux-Bartels and Papandreou have proposed the 
Generalized Exponential Distribution (GED) characterized by the kernel function 



where Wand M are positive constants representing the kernel order and 0, and x, are 
scaling constants. Comparing Figure 13 with Figure 8, the GED kernel shows a much 
flatter passband and steeper roll off. The GED includes the ED as a special case where 
N = M = 1 and = C. Also, the GED shares all of the properties of the ED. 

The four parameters of the GED distribution completely control its shape in the 
ambiguity domain and are chosen in reference to the signal's structure in the ambiguity 
domain. The parameter's role in shaping the filter can be investigated by considering the 
one-dimensional filter 



which is plotted in Figure 14 for various values of N. As N increases, the passband 
flattens and the transition region narrows. The scaling factor, x r determines the width of 




( 74 ) 




( 75 ) 





Figure 13. Contour plot of the GED's kernel function in the 
ambiguity domain. 
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the passband region. A set of criteria have been developed for determining the optimum 
set of parameters from a given set of constraints on the passband and stopband. 

Due to the complexity involved, the GED is implemented solely in the ambiguity 
domain. The ambiguity is first calculated and then multiplied by the kernel. Taking the 
two-dimensional FFT of the product gives the GED. 

E. REDUCED INTERFERENCE DISTRIBUTIONS 

As seen earlier, for a time-frequency distribution to have the properties listed in Table 
1 , its associated kernel must satisfy the constraints listed in Table 2. In addition, to 
suppress cross-terms the kernel must be a lowpass filter in the ambiguity domain. Jeong 
and Williams [4] have introduced a subset of Cohen's generalized distribution, the 
reduced interference distribution (RID), that has all of the above desirable characteristics. 
By considering the limitations above imposed on the kernel, they have also introduced a 
simple design process to produce RID kernels and thus, members of the RID class. 

The design process proposed by Jeong and Williams to design a RID kernel consists 
of three steps. 

Step 1: Determine a real- valued, primitive function h(t) such that: 




Figure 14. Variation of the lowpass filter given by Equation (74) 
with the parameter N. 
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Rl: h( t) has unit area. 



jh(t)dt = 1. (76) 

R2: h(t) is symmetrical about the origin, h(t) = h(-t). 

R3: h(t) is only non-zero in the interval [-'/i.'/i], 

R4: h(t) has little high frequency content so that H(Q) represents a lowpass filter. 

Step 2: Take the Fourier transform of h(t), 

7/(6) = \h{t)e- ,2l *‘dt. (77) 

Step 3: The RID kernel is obtained by replacing 0 in H(0) with 0x, 

4> wd (0,t) = //(0t). (78) 

In Step 1, the requirements R1-R3 ensure that the resulting kernel will meet the 
constraints listed in Table 2. Condition Rl implies that constraints Q1-Q3 will be met 
since 

H( 0) = J h{t)dt = 1 (79) 

and the argument of H(6t) becomes zero whenever 0 or x are zero. Condition R2 
ensures that H(0) will be real which in turn implies that Q6 holds. Also, Q10 and Q1 1 
are also implied unless the derivatives fail to exist. Condition R3 ensures Q8 since 

\|/(M)= jtKO.T)^ 2 " 8 ' 




= 0 if |t| < 2|r|. 
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Since a similar relationship holds for 4^(0/), Q9 also holds. As the resulting kernel is not 
a function of time or frequency, Q4 and Q5 are satisfied by default. However, Q7 will 
not be satisfied. 

Condition R4 requires h(t) to be lowpass in the frequency domain. Then, performing 
the substitution 0x for 0 in Step 3, the resulting kernel is guaranteed to be lowpass in the 
ambiguity domain and thus suppresses cross-terms. However, meeting R4 involves a 
trade-off since suppression is purchased at the expense of lower auto-term resolution due 
to temporal smoothing. 

There are two useful methods for determining a good choice for the primitive 
function h(t). First, any of the popular windows used in spectral analysis represent 

reasonable choices as long as they are truncated in time to satisfy R3. Another method is 
to use the windowing technique found in FIR filter design. The kernel <J)(0,t) = //(0x) is 

first specified in the ambiguity domain. Reversing Step 3, h(t) is found from the inverse 
Fourier transform of H(Q). Then, h(t) is multiplied by a rectangular window whose 
support is limited to [~V2,W\ so that R3 is satisfied. 



After h(t) has been designed, the RID is derived from Cohen's generalized distribution 
(25) as 



The RID can be interpreted as the Fourier transform of a generalized autocorrelation 
function 



RlD(t,f:h)=] ]^hi—\x(u + -)x'(u--)e- j2ic *dudx. 




(81) 




(82) 



where 




( 83 ) 
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Literally an infinite number of likely primitive functions, h(t), exist. For example, the 



function 



h (t) = rect(f) = 



f 1 M<K 

lO otherwise. 



(84) 



leads to the Bom-Jordan distribution [6], Another example is the triangular function 



h(t) = j 

which gives the kernel 



2-41 \,\<a 

0 otherwise 



(85) 



4>(6,x) = sinc 2 |^j. 



( 86 ) 



None of the previously discussed kernels qualify as RID kernels although each can be 
derived from the process above by relaxing some of the requirements for h(t). 



F. SUMMARY OF PROPERTY SUPPORT 

Table 3 allows comparison of each time-frequency distribution discussed above in 
terms of how they support the desired properties listed in Table 1. 



TABLE 3: PROPERTY COMPARISON OF TIME-FREQUENCY DISTRIBUTIONS 



Distribution 


PI 


P2 


P3 


P4 


P5 


P6 


FI 


P8 


P9 


P10 


Pll 


Wigner 


X 


X 


X 


X 


X 


X 




X 


X 


X 


X 


Choi-Williams 


X 


X 


X 


X 


X 


X 








X 


X 


CSD (ZAM) 








X 


X 


X 




X 


X 


X 


X 


GED 


X 


X 


X 


X 


X 


X 








X 


X 


RID 


X 


X 


X 


X 


X 


X 




X 


X 


X 


X 
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V. ADAPTIVE RADI ALLY -GAUSSIAN KERNEL 



A. BACKGROUND 

Even though most of the kernels presented in Chapter IV can be adjusted to give 
varying amounts of cross-term suppression, they do not actively take into account the 
nature of the signal itself. Since the positions of the auto-terms and cross-terms depend 
on the signal, fixed kernels can only be expected to give good results for a limited class 
of signals. For example, consider the ED, GED, and the RID. To preserve the time and 
frequency marginals, their kernels are constrained to be unity along the x and 0 axes. For 
a signal composed of complex sinusoids, such as presented in Figure 2, these 
distributions give good results. Generally these distributions work best with signals 
whose auto-terms closely parallel either the x or 0 axes. However, for signals whose 
auto-terms do not lie on either axis, such as the signals in Figures 3 and 4, worse results 
are obtained. Clearly, to handle a broad class of signals the kernel function must be 
made signal-dependent even at the expense of sacrificing a few desired properties in the 
resulting distribution. 

A procedure for designing signal-dependent kernels has been proposed by Baraniuk 
and Jones [20], Their procedure consists of choosing an optimal kernel from a class of 
kernels defined by a series of constraints. Example constraints include forcing the kernel 
to be lowpass to suppress cross-terms or ensuring that the kernel preserves the time and 
frequency marginals. The optimal kernel is the one which maximizes some particular 
performance measure. Depending on the kernel constraints and the performance measure 
chosen, different optimal kernels are realized. 

Baraniuk and Jones chose as the basis for their adaptive kernel the class of radially- 
Gaussian functions, 

6 2 +T 2 

<D(0,x) = /^, (87) 

which may also be expressed in polar coordinates as 



36 






( 88 ) 




The shape of the kernel depends only on a(\| /), the spread function, which controls the 
spread of the kernel along a radial line oriented at angle vy, where 



V = arctan 




(89) 



Clearly, optimizing the kernel consists of obtaining an optimal spread function. Figure 
15 shows an example spread function and Figure 16 shows the resulting kernel. 

Given the class of kernels above, Finding the optimal kernel becomes an optimization 
problem. The optimal kernel is defined as the one whose spread function yields 



2 *~ 

max J J|A(r,\j/)(}>(r,y)|‘rGM\j/ (90) 

* oo 

where A(r,\j/) is the polar representation of the ambiguity function. Furthermore, the 
optimization is subject to the constraints 

r 2 

<l>(r,\|/) = e 2<T(y) (91) 




Figure 15. An example spread vector 
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Figure 16. Radially-Gaussian kernel corresponding to the 
spread vector shown in Figure 15 




and 

. 2 n « 

— J j|([)(r,V|/)| 2 rJrJ\j/<a, a> 0 . ( 92 ) 

2 * 0 0 

Substituting (91) into (92) and recognizing that since the ambiguity function is 
symmetric about the origin, the spread function only has to be determined over the range 
[0,rc). The latter constraint simplifies to 

-J-?a 2 (\j/)</v|/<a. (93) 

2*1 

The purpose of the performance measure, (90), and the constraints (91)-(93), is to 
give an optimal kernel which suppresses cross-terms while the distortion of auto-terms is 
minimal. Equation (91) limits the kernel to the class of radially-Gaussian kernels, which 
are lowpass in nature and thus suppress cross-terms. The constraint on kernel volume, 
(92) or (93), controls the trade-off between cross-term suppression and auto-term 
smearing. Increasing the volume raises the probability that auto-terms are passed by the 
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kernel unattenuated while also raising the probability that some cross-terms pass. 
Decreasing the volume performs the opposite. The recommended range for a has been 
given as 



l<oc<5 



( 94 ) 



where the lower bound arises from the volume of the spectrogram kernel. Fixing the 
upper bound consists of finding the best compromise between smearing the auto-term of 
a simple Gaussian signal and passing the majority of its energy. Appendix B shows the 
derivations for the lower and upper bounds. 

B. IMPLEMENTATION 

The first step in implementing a solution for the optimal kernel is to rewrite the 
performance measure, (90), and the constraints, (91 and (93), in the discrete domain. 
Sampling the radially-Gaussian kernel on a polar grid gives 



where p and q represent the radial and angular indices, and A r and A y are the 

corresponding step sizes. The discrete spread function is a positive vector consisting of 
samples from the spread function, 



Using the polar grid defined in (95), the optimal discrete kernel is the one whose spread 
vector yields 




(95) 



p = 0,...,P-\, q = 0,...,Q-\, 




(96) 



max 






(97) 



<F 0 P= l 



subject to the constraints 
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(98) 



(P*rf 



4 'p{p*q) = e 



and 




(99) 



Calculating the performance measure, (97), requires that the ambiguity function be 
sampled on a polar grid, A p (p,q). Two approaches can be taken. First, the polar-sampled 
ambiguity function can be calculated directly as shown in [21]. Alternately, the polar 
samples can be interpolated from a rectangularly-sampled ambiguity function. The 
interpolation is performed by first defining a polar grid. At each point (r,\| /) on the grid, 
the point is converted back into equivalent rectangular coordinates. Next, the four 
nearest neighbors bounding the point are found and bilinear interpolation [22] is used to 
estimate the ambiguity function at the desired point. For greater accuracy, the closest 
three out of the four bounding points can be used to perform two successive linear 
interpolations [21] [22], Due to the symmetry of the ambiguity function, only the upper 
half of the ambiguity plane need be sampled. 

Restating the optimization problem, the goal is to find the spread vector 
a = [a 0 ... a c _,] which maximizes the function 



subject to the constraint (99). An appropriate method for solving this problem is the 
steepest ascent algorithm. 



e-‘ p - i . , 2 

f(a) = A]A^^p\Ap(p,q)^ p (p,q)\ 



q=0 p=\ 




( 100 ) 



< 7=0 p=\ 



a(k + l) = q(k) + \i¥f{k). 



( 101 ) 



where the present guess is updated in the direction of the gradient 
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T 



( 102 ) 



¥/(*) = 



a/ 

d<*i(*) 



a/ 

d<W*) 



The step size, |i, is chosen to ensure to ensure the most rapid convergence without being 
so large that the process becomes unstable. For (100), the elements of the gradient vector 
are 



r A 4 A P-l (p&r 

9 / V 3 | a ( \| 2 



oTik) 



q = 0,...,Q-\. 



(103) 



The algorithm is initialized with 



2 ( 0 ) = 





(104) 



which represents a circularly symmetric kernel of volume a. 

The steepest ascent algorithm does not include the volume constraint (99). Since the 
gradient is always positive, the volume of the kernel increases with each iteration. To 
keep the volume constant at a, the spread vector is scaled after each iteration by 



o(* + 1) <- 2(* + 1) - v .. . 005) 

This operation forces the kernel volume to a without changing the shape of the spread 
vector. 

Once the optimal spread vector is found, the optimal radially-Gaussian kernel is 
generated in rectangular coordinates using (87). Since only samples of the spread vector 
are available, interpolation is necessary to give a smooth kernel. Typically a simple 
cubic spline gives excellent results. Then the time-frequency distribution is given by a 
two-dimensional DFT of the generalized ambiguity function, which in turn is the product 
of the optimal kernel and the rectangularly-sampled ambiguity function. Figure 17 
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Figure 17. The optimal kernel generated for the complex 
signal given by Equation (42) 

shows the optimal kernel found for the complex signal (42) and Figure 1 8 shows the 
resulting time-frequency distribution. 

The adaptive radially-Gaussian kernel demonstrates excellent results for a broad class 
of signals but has some drawbacks. First, this technique is computationally much more 
expensive than the fixed kernels shown in Chapter Four. Convergence to an optimal 
spread vector is slow; typically, about thirty iterations are needed. Also, interpolating to 
find the polar-sampled ambiguity function and later generating the optimal kernel are 
time-consuming. The second drawback is the sacrifice of desirable properties in the 
resulting time-frequency distribution. Of the properties listed in Table 1, the optimal 
kernel only guarantees preservation of signal energy, shift invariance, and a real-valued 
distribution. Baraniuk has shown how to extend the optimization procedure outlined 
above to include additional constraints on the kernel such that marginal and time support 
properties are guaranteed by the optimal kernel [21]. 
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Figure 18. Time-frequency distribution generated from 



the optimal kernel for the complex signal 
given by Equation (42) 
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VI. COMPARISON OF TIME-FREQUENCY 
DISTRIBUTIONS 



A. EXPERIMENTAL ANALYSIS 

This section compares the performance of the fixed and adaptive kernel time- 
frequency distributions using several classes of synthetic analytic signals. The intent is to 
identify any additional benefits or drawbacks for each specific distribution, such as 
resolution, and demonstrate how the distributions obey the properties listed in Table 1. 
Distributions considered here are: 

• Wigner-Ville 

• Choi-Williams 

• ZAM 

• Signal dependent distribution using the Adaptive Radially-Gaussian Kernel 
Seven test signals are used to evaluate each distribution. Each signal consists of 512 

points and is analytic. In each case, the time-frequency surface is built using 64-point 
DFTs and is displayed as either a contour or mesh plot, depending on which best displays 
the surface features. To keep the plot size manageable, the time window was moved in 
steps of eight data points, resulting in a 64 x 64 surface. The test signals are: 

1. Two-component analytic sinusoid where the components are close in frequency. 




2. Frequency shift keyed signal, 





385 < n <512 



129<«<384, 



44 



3. Mono-component FM linear chirp, 



n J.L+±JLX 
x(n) = e 64 512 ", 



4. Two component signal composed ot two parallel, FM linear chirps. 



;2n{— + — ;2i/— + - - 

x(n) = e 64 5127 +c " w 64 5127 , 



5. Mono-component FM quadratic chirp, 






6. Multi-component signal composed of two FM quadratic chirps and an analytic 
sinusoid, 



x(n) 






7. Noisy signal composed of two FM linear chirps crossing in the time-frequency 
plane with an SNR of 3 dB, 



x(n) = 4e 






+ r\(n). 



B. TEST SIGNAL ONE 

Test signal 1 consists of two analytic sinusoids, spaced very closely in frequency, and 
is used to demonstrate the frequency resolution of the different distributions. The 
Wigner distribution. Figure 19(a), only weakly suggests the presence of two frequencies. 
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Oscillatory modulation due to the cross-term is imposed on the auto-terms. With the 
exponential distribution. Figure 19(b), the cross-terms is smoothed out slightly and the 
auto-terms are more prominent. The presence of two distinct frequency components is 
clearly visible using a c of 1. Lower values cause the auto-terms to mere due to 
smoothing while larger values give the same result as the Wigner distribution. The ZAM 
distribution, Figure 20(a), shows only a single smoothed, oscillatory frequency 
component and fails to resolve the sinusoids. Finally, the optimal distribution. Figure 
20(b), gives a very sharp spectrum indicating the presence of three frequency 
components. These are the auto-terms and a smoothed cross-term. The optimal kernel 
does not completely remove the cross-term due to its close proximity to the auto-terms 
on the ambiguity plane. 

C. TEST SIGNAL TWO 

Test signal 2 is a frequency shift keyed signal and is used to demonstrate how the 
distributions differ in frequency support. The Wigner distribution. Figure 21(a), shows 
violation of frequency support as well as cross-terms between each frequency transition. 
This effect can be minimized by reducing the size of the window but at the expense of 
frequency resolution. For the exponential distribution. Figure 21(b), the cross-terms 
have been eliminated and the leakage of spectral energy to other frequency bins is much 
improved over the Wigner distribution. The ZAM distribution, Figure 22(a), gives the 
best results as expected due to its emphasis on preserving time and frequency support. 
Spectral leakage is minimal; only a small widening of the peaks is observed at the 
beginning and end of each spectral component. Finally, the optimal distribution gives 
very disappointing results. While the distribution has no evident cross-terms, the surface 
is covered by artifacts due to the optimal kernel's lack of time and frequency support. 
These artifacts can be reduced by decreasing the volume of the kernel. 
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Figure 19. Wigner and exponential distributions for test signal 1 

(a) Wigner distribution 

(b) Exponential distribution 
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Figure 20. ZAM and optimal distributions for test signal 1 

(a) ZAM distribution 

(b) Optimal distribution 
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(a) 




Figure 21. Wigner and exponential distributions for test signal 2 

(a) Wigner distribution 

(b) Exponential distribution 
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(a) 




(b) 



Figure 22. ZAM and optimal distributions for test signal 2 

(a) ZAM distribution 

(b) Optimal distribution 
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D. TEST SIGNAL THREE 



Test signal 3 consists of a single FM linear chirp. All of the distributions give 
excellent results and track the instantaneous frequency of the signal accurately. For large 
values of G, the Wigner distribution, Figure 23(a), and the exponential distribution. 
Figure 23(b), give virtually identical results. Reducing o smoothes the chirp in the 
exponential distribution. The ZAM distribution, Figure 24(a), gives very sharp results. 
Lastly, the optimal distribution, Figure 24(b), gives a narrow, smoothed chirp but with 
declining magnitude near the beginning and end of the chirp. Increasing the volume of 
the optimal kernel decreases the latter effect. 

E. TEST SIGNAL FOUR 

Test signal 4 consists of two parallel, FM linear chirps. The Wigner distribution. 
Figure 25(a), shows the expected cross-term midway between the auto-terms. Compared 
to the auto-terms, the cross-term is more oscillatory and of greater magnitude. At the 
cost of smoothed auto-terms, the exponential distribution, Figure 25(b), is able to greatly 
reduce the cross-term by smearing its energy across the region between the chirps. Here, 
a a of five sufficed to give good results. Both the ZAM distribution. Figure 26(a), and 
the optimal distribution. Figure 26(b), removed the cross-terms entirely while retaining 
sharp auto-terms. Again, the optimal distribution demonstrates a slow buildup and decay 
at the beginning and end of the chirps. 

F. TEST SIGNAL FIVE 

Test signal five consists of a single FM quadratic chirp whose instantaneous 
frequency changes rapidly. All of the distributions performed well and tracked the 
instantaneous frequency accurately. The Wigner distribution. Figure 27(a), gives a 
narrow, oscillatory representation that sharpens at higher instantaneous frequencies. 
Similar results are obtained with the exponential distribution, Figure 27(b), although the 
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(b) 



Figure 23. Wigner and exponential distributions for test signal 3 

(a) Wigner distribution 

(b) Exponential distribution 
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(b) 

Figure 24. ZAM and optimal distributions for test signal 3 

(a) ZAM distribution of test signal 3 

(b) Optimal distribution of test signal 3 



53 





Figure 25. Wigner and exponential distributions for test signal 4 

(a) Wigner distribution 

(b) Exponential distribution 
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Figure 26. ZAM and optimal distributions for test signal 4 

(a) ZAM distribution 

(b) Optimal distribution 
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ridge representing the chirp is slightly smoother. The ZAM distribution, Figure 28(a), 
gives a sharp representation but the magnitude of the ridge decreases with increasing 
instantaneous frequency. Apparently, the Gaussian window used with the cone-shaped 
region of support tends to attenuate signal energy when the signal is highly 
nonstationary. Finally, the optimal distribution. Figure 28(b), gives a smoother 
representation than the other distributions. Like the ZAM distribution, the ridge 
decreases in magnitude with instantaneous frequency but to a smaller extent. 

G. TEST SIGNAL SIX 

Test signal 6 is a complex multi-component signal with both nonstationary and 
stationary components. These components include two FM quadratic chirps and a 
complex sinusoid. As expected, the Wigner distribution. Figure 29(a), produces a 
complicated spectrum due to the presence of cross-terms between the three signal 
components. The cross-terms are similar in magnitude to the auto-terms, making 
interpretation difficult. Going to the exponential distribution, Figure 29(b), improves the 
distribution at the cost of some smoothing. Still, the distribution shows that a small 
number of artifacts persist, especially where the quadratic chirps cross. The ZAM 
distribution, Figure 30(a), gives the best results. All of the cross-terms are suppressed 
with little apparent smoothing. As mentioned earlier, the magnitude of the quadratic 
chirps decreases with increased instantaneous frequency. Also, the ZAM distribution 
totally suppresses the crossover of the quadratic chirps. The optimal distribution, Figure 
30(b), gives very disappointing results. While the sinusoid shows up strongly, the 
quadratic chirps are smeared and inconsistent. Evidently, the optimal kernel favored the 
sinusoid over the chirps in the ambiguity domain and thus partially suppresses the chirps. 
Increasing the kernel volume leads to less suppression but greater leakage of cross-terms 
as Figure 30(b) indicates. This indicates that finer sampling of the ambiguity function is 
needed for spectrally dynamic signals to avoid competition between components. 
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Figure 27. Wigner and exponential distributions for test signal 5 

(a) Wigner distribution 

(b) Exponential distribution 
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(a) 




(b) 

Figure 28. ZAM and optimal distributions for test signal 5 

(a) ZAM distribution 

(b) Optimal distribution 
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Figure 29. Wigner and exponential distributions for test signal 6 

(a) Wigner distribution 

(b) Exponential distribution 
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(b) 

Figure 30. ZAM and optimal distributions for test signal 6 

(a) ZAM distribution 

(b) Optimal distribution 
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H. TEST SIGNAL SEVEN 



Test signal 7 consists of two FM linear chirps, which cross on the time-frequency 
plane, embedded in Gaussian white noise. The SNR is 3 dB. Except for the Wigner 
distribution. Figure 31(a), all of the distributions suppress the noise so the spectral 
features of the signal are evident. The Wigner distribution is unrevealing. Using a a of 
five, the exponential distribution. Figure 31(b), removes the noise sufficiently that the 
chirps are clearly visible. Both the ZAM distribution, Figure 32(a), and the optimal 
distribution. Figure 32(b), do a superior job of suppressing noise. Once again, the ZAM 
distribution appears to suppress the signal near the crossing of the chirps. For the 
optimal distribution, the kernel volume controls the rejection of the noise power. Figure 
32(b) used a volume of two; a lower value tends to attenuate the auto-terms. 
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(b) 



Figure 31. Wigner and exponential distributions for test signal 7 

(a) Wigner distribution 

(b) Exponential distribution 
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(a) 




(b) 



Figure 32. ZAM and optimal distributions for test signal 7 

(a) ZAM distribution 

(b) Optimal distribution 
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VII. RECOMMENDATIONS AND CONCLUSIONS 



In this thesis, the ability of time-frequency distributions to represent the time-varying 
spectral characteristics of nonstationary signals has been examined. Working with 
Cohen's generalized class of distributions (25), the relationship of the kernel's properties 
to those of the resulting distribution was shown. Of these, the most important is the 
ability to suppress cross-terms arising from the bilinear structure of (25). If the spectral 
components of the signal are obscured on the time-frequency plane by cross-terms, the 
remaining properties listed in Table 1 become unimportant. One way to suppress cross- 
terms is via a two-dimensional lowpass kernel in the ambiguity domain. Chapter Four 
showed various kernels and their resulting distributions. However, better performance 
for certain classes of signals can be realized by using a lowpass kernel that adapts to the 
signal's structure in the ambiguity domain as shown in Chapter Five. 

Examining the results shown in Chapter Six for various synthetic analytic signals, the 
Wigner distribution, the most widely known example of Cohen's generalized distribution, 
suffers from cross-terms. For multicomponent or noisy signals, very poor results are 
obtained. The exponential distribution represents a distinct improvement but trades off 
smoothing of auto-terms for removing of cross-terms. Both of these distributions also do 
not offer time or frequency support (see Table 3.) The ZAM distribution performed well 
for all of the signals shown in Chapter Six and in addition had the ability to display the 
spectral features of signals embedded in noise. 

The optimal kernel implemented here shows great promise but needs a few alterations 
for the best results. For the synthetic signals shown in Chapter Six, the optimal kernel 
was able to remove cross-terms while retaining narrow spectral peaks in the time- 
frequency plane. The optimal kernel was superior in resolving the spectral features of a 
noisy, multicomponent signal. However, disappointing results were obtained for the 
FSK and multicomponent signals. As implemented, the optimal kernel makes no attempt 
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to satisfy the time and frequency support properties or the marginal support properties. 
These properties should be included in the optimization routine as additional constraints 
and their effects examined. The poor results shown for the multicomponent signal can 
probably be alleviated through finer sampling of the ambiguity function. 

In the future, the RID class of kernels should be examined further to understand the 
implications in choosing different windows. The RID potentially offers the best 
performance of any fixed kernel. For the adaptive kernel, work needs to be focused on 
finding a less expensive method for obtaining an optimal kernel. At the present, 
determining the optimal distribution as compared to the ZAM distribution takes an 
amount of time which is two orders of magnitude larger. One possible approach is to 
base the spread vector on the amount of energy present at each sample angle and scale to 
give the desired kernel volume. In this manner, the expensive optimization step can be 
skipped. 
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APPENDIX A. MATLAB SOURCE CODE 



The MATLAB code used in generating the distributions examined in Chapter Six is 
listed below. All data is assumed to be analytic. Real data should first be translated into 
its associated analytic signal by using the techniques discussed in Chapter Four. 



1. Wigner-Ville Distribution 

function PS = wvd (data, winlen, step, begin, theend) 

% PS = wvd (data, winlen, step # begin, theend) 

% 

% 'wvd.m' returns the Wigner-Ville time- frequency distribution 
% for the input data sequence. Window length and time step size 
% are determined by the user but the window length should be a 
% power of two. By default the entire data sequence is used but 
% user may specify specific intervals within the data by using 
% 'begin' and 'end*. 

% 

% data: input data sequence 
% winlen: window length 
% step: time step size 

% begin: desired starting point within data 
% theend: desired ending point within data 

[m,n] = size(data); 
if m > n 

data = data . ' ; 

end 

datalen = length (data) ; 

% use user specified starting and ending points if present 
start = 1; 
finish = datalen; 
if nargin == 5 
if begin > 1 

start = begin; 

end 

if theend < datalen 
finish = theend; 

end 

end 

% initialize data spaces 

data = [zeros (1, winlen/2) data zeros ( 1 ,winlen/2 )] ; 
prod = zeros (1 , winlen/2 + 1) ; 
corr = zeros (1, winlen) ; 

PS = zeros (1 , winlen) ; 

% calculate distribution 
index = 1; 

for n = (winlen/2 ) +start : step: (winlen/2) +finish 

% calulate instantaneous correlation function 

prod = data (n-winlen/2 :n) . *conj (fliplr (data (n :n+winlen/2 ))) ; 
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corr = [prod conj ( f liplr (prod (2 : winlen/2 ) ) ) ] ; 
corr(l) = 0; 

PS (index,:) = f ft (corr); 
index = index + 1; 



end 



2. Exponential Distribution 

function PS = cw (data, tauwin, muwin, step, sigma) 

% 

% PS = cw (data, tauwin, muwin, step, sigma) 

% 

% Implements exponential kernel via RWED algorithm 
% 

% data: data sequence 
% tauwin: outer window length 
% muwin: inner window length 
% step: time step size 

% sigma: parameter for exponential kernel; a smaller 
% value reduces the cross -terms more 



[m,n] = size(data); 
if m > n 

data = data . ' ; 

end 

datalen = length (data) ; 

%initialize data spaces 
winlen = tauwin/2 + muwin/2; 

data = [zeros (1, winlen) data zeros ( 1 , winlen) ] ; 
corr = zeros (1 , tauwin) ; 

PS = corr; 

mu = -muwin/2 + 1: muwin/ 2 - 1; 

% Apply Choi-Williams RWED 
line = 1; 

for n = l+winlen:step:datalen+winlen 
index = 2; 

for tau = -tauwin/2 + 1: tauwin/2 -1 

% calculate smoothed autocorrelation function 
if tau ~= 0 

scale = 1/ (sqrt (4*pi*tau A 2/sigma) ) ; 
gwin = exp ( -sigma *mu . *2/ (4*tau A 2) ) ; 
f ilt = gwin. Mata (n+tau+mu) . *conj (data (n-tau+mu) ) ; 
corr (index) = scale*sum ( f ilt ) ; 
else 

corr (index) = data (n) . *conj (data (n) ) ; 

end 

index = index + 1; 



end 

PS (line,:) = f ft (corr) ; 
line = line + 1; 

end 
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3. ZAM Distribution 



function PS = zam (data, winlen, step) 

% 

% PS = zam (data, winlen, step) 

% 

% ' zam.m‘ returns the ZAM time- frequency distribution for 
% the input data sequence. Window length and time step size 
% are determined by the user but each should be a power of 
% two. A Gaussian window is used to filter the 
% autocorrelation estimate where the variance is chosen 
% such that the window has a value of 0.0001 at its endpoints. 
% 

% data: data sequence 
% winlen: window length 

[m,n] = size(data); 
if m > n 

data = data. ' ; 

end 

datalen = length (data) ; 

%initialize data spaces 
maxlen = winlen - 1; 

data = [ zeros ( 1 , 2*maxlen) data zeros (1 , 2*maxlen) ] ; 
corr = zeros ( 1 , winlen) ; 

PS = corr; 

% determine Gaussian window 
k = [0 : maxlen] ; 

alpha = -log (0.0001) / (2*maxlen /N 2 ) ; 
gwin = exp (-2*alpha*k. ^2) ; 
gwin(l) = 0.5*gwin(l); 



% Apply ZAM method 
line = 1; 

for n = 1 + 2 *maxlen: step: datalen + 2*maxlen 

% find local autocorrelation 
for tau = 0:maxlen 
mu = [-tau: tau]; 

corr(tau+l) = sum (data (n-mu+tau) . *conj (data(n-mu- 
tau) ) ) *gwin (tau+1) ; 

end 

PS (line, : ) = 4*real (f f t (corr) ) ; 
line = line + 1; 



end 



4. Optimal distribution 

Finding the optimal distribution within MATLAB requires three separate programs: 

POLTERP.M - performs polar interpolation of a rectangularly sampled ambiguity 
function 
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OPTKERN.M - given the polar-sampled ambiguity function and desired kernel 
volume, determines the optimum spread vector 

KERNEN.M - produces the optimal kernel given the optimal spread vector 

Once the optimal kernel has been found, the signal's ambiguity function is multiplied by 
the kernel. Taking the two-dimensional Fourier transform of this product gives the 
optimal distribution. 



function pol_af = polterp(af) 

% pol_af = polterp(af) 

% 

% 'polterp.m' perforins a polar interpolation of a 
% rectangularly sampled ambiguity function. By default 
% a 64x64 input matrix is assumed. 

% 

% pol_af: polar sampled ambiguity function 
% af : rectangularly sampled ambiguity function 

cx = 33; 
cy = 33; 

pol_af = zeros (31 , 31) ; 
delpsi = pi/31; 
af = af . *conj (af ) ; 

% copy points along tau-axis 
pol_af(l / :) = af (cy,cx+l:64) ; 

% interpolate, work along concentric circles of 
% increasing radius 
for rad = 1:31 

psi = delpsi; 
for ang = 2:31 

% calculate rectangular coordinates of current point 
x = rad*cos (psi) ; 
y = rad*sin(psi) ; 

% convert to array coordinates 

x = cx + x; 

y = cy - y; 

% find four nearest neighbors 

xl = floor (x); 
yl = floor (y); 

x2 = xl + 1; 
y2 = yl; 

x3 = x2 ; 
y3 = yl + 1; 

X4 = xl; 
y4 = y3 ; 
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% find the AF values of the neighbors 
af 1 = af (yl ,xl) ; 
af2 = af (y2 ,x2) ; 
af 3 = af (y3 , x3 ) ; 
af 4 = af (y4 , x4 ) ; 

% perform 2D interpolation via two ID interpolations 
% determine if point lies in upper or lower triangle 
% of neighborhood 

delx = x - xl; 
dely = y - yl; 

if delx > dely % upper triangle 

xp = x2; 
if x ~= xl 

yp = y2 + (y - yl) / (x - xl) ; 
else 

yp = y2; 

end 

afp = af2 + (af3 - af2)*(yp - y2); 
else % lower triangle 

yp = y3; 

if y ~= yl 

xp = x4 + (x - xl) / (y - yl) ; 
else 

xp = x4 ; 

end 

afp = af4 + (af3 - af4)*(xp - x4 ) ; 



end 

% perform final interpolation 

dl = sqrt { (x - xl) A 2 + (y - yl) A 2); 
d2 = sqrt ( (x - xp) A 2 + (y - yp) A 2); 

pol_af (ang, rad) = afl + ((afp - afl)/(dl + d2))*dl; 
psi = psi + delpsi; 



end 



end 



function spv = optkern (pol_af , vol,mu) 

% spv = optkern (pol_af # vol # mu) 

% 

% 'spv.m' calculates the optimum spread vector using the 
% steepest ascent algorithm. 

% 

% spv: optimal spread vector 
% pol_af : polar sampled ambiguity function 
% vol : desired kernel volume (typically 1-5) 

% mu: step size (typically 25) 

% set up constants 
true = 1; 
false = 0; 

[maxa,maxr] = size (pol_af ) ; 
delpsi = pi/maxa; 
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converged = false; 
maxiter = 1 ; 
tol = 0.001; 

%pol_af = pol_af /max (max (pol_af ) ) ; 

% initialized spread vector 

spv = sqrt (2*pi*vol/ (maxa*delpsi ) ) *ones (maxa, 1) ; 



% enter steepest ascent loop 

while (converged == false) & (maxiter < 30) 

% calculate gradient for each angle 
grad = zeros (maxa, 1 ) ; 
for ang = 1 :maxa 

% sum over each radial point 
for rad = l:maxr 

term = rad^3*pol_af (ang # rad) *exp (-rad^2/spv (ang) ^2) 
grad (ang) = grad (ang) + term; 

end 

grad (ang) = delpsi*grad (ang) /spv (ang) ^3 ; 



end 

% update spread vector 
spv = spv + mu*grad; 

% scale spread vector to maintain constant volume 
spv = spv*sqrt (2*pi*vol/ (delpsi*sum (spv. ^2 ) ) ) ; 

% check for convergence 
if sum (grad) < tol 
converged = true; 

end 

maxiter = maxiter + 1; 



end 

% prepare spread vector for kernel generation 
spv = [ spv ' spv ( 1 ) ] ; 



function kernel = kerngen ( spread) 

% kernel = kerngen (spread) 

% 

% ' kerngen. m' generates the optimal kernel given the optimal 
% spread vector. By default, the kernel is 64x64. 

% 

% kernel: optimal kernel 
% spread: optimal spread vector 

delq = pi/ (length (spread) - 1) ; 
max_m = 31; 
max_n = 16; 

kernel = zeros ( 16 # 31 ) ; 

% generate the upper- half of the kernel 
n = 1; 

for i = 0:31 
m = 1 ; 
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for j = -32:31 

% convert rect . coordinates to polar 
rsq = ( (i/8) ^2 + (j/8) A 2); 
q = atan2 (i , j ) /delq + 1; 

% smoothly interpolate the spread vector for the current angle 
var = spline (1 : length (spread) , spread, q) ; 
kernel (n # m) = exp (-rsq^2/ (2*var A 2 ) ) ; 
m = m + 1; 



end 

n = n + 1; 



end 

% generate the rest of the kernel using symmetry 
kernel = [ f lipud (kernel) ' f liplr (kernel (2 :32 , :))']' ; 
kernel = [zeros (64,1) kernel ; 
kernel (:,1) = zeros (64,1); 
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APPENDIX B. DERIVATIONS OF VOLUME LIMITS 
ON THE OPTIMAL KERNEL 



As stated previously, the constraint on kernel volume controls the trade-off between 
cross-term suppression and the smoothing of auto-terms. A low kernel volume increases 
the probability that cross-terms will be suppressed and that the auto-terms will be 
smoothed. Increasing the volume gives sharper auto-terms but raises the probability that 
some portion of the cross-terms pass. Baraniuk and Jones have recommended as a 
general guideline that the kernel volume be in the range [20] [21] 

l<a<5. (106) 

A derivation of these limits follows. 

A lower limit is placed on the kernel volume by restricting the smoothing of the auto- 
terms. A reasonable limit is to allow no more smoothing than the spectrogram. The 
volume of the spectrogram kernel, 4^(0, x), is given by 



^ j J|<M9,Tf<»H4> 5 (o,of = 1 



(107) 



assuming the average energy of the window is unity. Thus, the lower bound on the 
kernel volume is 



a>l. 



(108) 



The upper limit is motivated by the desire to limit smoothing without passing 
significant cross-term energy. To develop a reasonable upper limit, Baraniuk and Jones 
consider the Gaussian signal 



x(t) = 



1 "7 

y* e • 



n 

For this signal, the ambiguity function is 

_(p+ri 

A(0,t) = e 4 

and the optimal kernel is given by 



(109) 



( 110 ) 
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(Ill) 



e ; +t 2 

^(0.x) = e" 40 

Taking the two-dimensional Fourier transform of the generalized ambiguity function 
gives the optimal time- frequency distribution. 






a 

n(l + a) 




( 112 ) 



To quantify the amount of smoothing as a function of the kernel volume, the radius r. of 
the circular contour where the auto-term has decayed to e~ l is calculated. 



=f±. 

V a 



(113) 



For large a, r e approaches unity indicating no smoothing but with complete passage of 
all cross-terms. Decreasing a increases r t , indicating greater smoothing and a smaller 
chance of passing cross-terms. This relationship is shown in Figure 33. For values of q 
significantly greater than one, the amount of smoothing does not change significantly but 
the chance of passing cross-terms increases rapidly. A reasonable upper bound occurs at 
the knee point of (113), or 

q<5. (114) 




Kernel Volume 



Figure 33. Plot of Equation (112) 



74 



List of References 



1. Steven M. Kay, Modem Spectral Estimation: Theory and Estimation, pp. 51-54, 
Prentice-Hall, 1988. 

2. Richard B. Altes, "Detection, Estimation, and Classification with Spectrograms," 
J. Acoust. Soc. Amer., Vol. 67, No. 4, pp. 1232-1246, April 1980. 

3. W. Mecklenbraucker, "A Tutorial on Non-Parametric Bilinear Time-Frequency 
Signal Representations," Les Houches, Session XLV, 1985, J. L. Lacoume and R. 
Stora, eds., Elsevier Science Publishers B.V., 1987. 

4. J. Jeong and W. J. Williams, "Kernel Design for Reduced Interference 
Distributions," IEEE Trans. Acoustic, Speech, and Signal Processing, Vol. 40, 

No. 2, pp. 402-412, February 1992. 

5. L. Cohen, "Generalized phase-space distribution functions," J. Math. Phys., Vol. 
7, pp. 781-786, 1966. 

6. L. Cohen, "Time Frequency Distributions - a Review," Proc. IEEE, Vol. 77, No. 
7, pp. 941-981, July 1989. 

7. A. W. Rihaczek, Principles of High-Resolution Radars, McGraw-Hill, 1969. 

8. P. Flandrin, "Some features of time-frequency representations of multicomponent 
signals," Proc. ICASSP, San Diego, CA, Mar. 1984, pp. 41B.4. 1-4.4. 

9. E. P. Wigner, "On the quantum correction for thermodynamic equilibrium," Phys. 
Rev., Vol. 40, pp. 749-759, 1932. 

10. J. Ville, "Theorie et applications de la notion de signal analytique," Cables et 
Transmission, Vol. 2 A, pp. 61-74, 1948. 

11. B. Boashash, "Notes on the use of the Wigner Distribution for Time-Frequency 
Signal Analysis," Trans. IEEE ASSP, Vol. 36, No. 9, pp. 1518-1521, September 
1988. 



75 



12. T. A. C. M. Claasen and W. F. G. Mecklenbrauker, "The Wigner Distribution - A 
Tool for Time-Frequency Analysis, Part I: Continuous Time Signals," Philips 
Journal of Res., Vol. 35, pp. 217-250, 1980. 

13. T. A. C. M. Claasen and W. F. G. Mecklenbrauker, "The Wigner Distribution - A 
Tool for Time-Frequency Analysis, Part II: Discrete Time Signals," Philips 
Journal of Res., Vol. 35, pp. 276-300, 1980. 

14. F. G. Stremler, Introduction to Communication Systems, pp. 259-261, Addison- 
Wesley, 1990. 

15. H. I. Choi and W. J. Williams, "Improved Time-Frequency Representations of 
Multicomponent Signals Using Exponential Kernels," IEEE Trans. Acoustic, 
Speech, and Signal Processing, Vol. 37, no. 6, June 1989. 

16. Y. Zhao, L. Atlas, and R. Marks, "The Use of Cone-Shaped Kernels for 
Generalized Representation of Nonstationary Signals," IEEE Trans. Acoustic, 
Speech, and Signal Processing, Vol. 38, no. 7, pp. 1084-1091, July 1990. 

17. T. N. Comsweet, Visual Perception, Academic Press, 1970. 

18. S. Oh and R. J. Marks, II, "Some Properties of the Generalized Time Frequency 
Representation with Cone-Shaped Kernel," IEEE Trans. Signal Processing, Vol. 
40, no. 7, pp. 1735-1745, July 1992. 

19. G. F. Boudreaux-Bartels and A. Papandreou, "On a Generalization of the Choi- 
Williams Time-Frequency Distribution," Proceedings of the Twenty-Fifth 
Asilomar Conference on Signals, Systems, and Computers, pp. 364-369, 
November, 1991. 

20. R. G. Baraniuk and D. L. Jones, "A Radially Gaussian, Signal-Dependent Time- 
Frequency Representation," IEEE ICASSP '91, May 1991. 

21. R. G. Baraniuk, "Shear Madness: Signal-Dependent and Metaplectic Time- 
Frequency Representations," Doctoral Thesis, University of Illinois at Urbana- 
Champaign, 1992. 



76 



22. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling, Numerical 
Recipes in C, pp. 104-106, Cambridge University Press, 1988. 



77 



INITIAL DISTRIBUTION LIST 



No. of Copies 

Defense Technical Information Center 2 

Cameron Station 

Alexandria, Virginia 22304-6145 

Library, Code 52 2 

Naval Postgraduate School 
Monterey, California 93943-5000 

Chairman, Code EC 1 

Department of Electrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943-5000 

Professor Ralph D. Hippenstiel, Code EC/Hi 3 

Department of Electrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943-5000 

Professor Monique P. Fargues, Code EC/Fa 2 

Department of Electrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943-5000 

Professor Charles W. Therrien, Code EC/Ti 1 

Department of Electrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943-5000 

Professor Roberto Cristi, Code EC/Cx 1 

Department of Electrical and Computer Engineering 
Naval Postgraduate School 
Monterey, California 93943-5000 

Naval Ocean Systems Command 1 

Attn: Dr. C. E. Persons, Code 732 
San Diego, CA 92152 

LT Robert E. Parker, Jr. 2 

SWOS 

Naval Amphibious Base 
Coronado, CA 92118 



78 



DUDLEY KN.O v 
NAVAL POv: 
MONTEREY v 






A 




GAYLORD S 



